Investigating Performance of the Discriminative Methods for Long-Term Speaker Adaptation

Authors

  • Danning Jiang
  • Dimitri Kanevsky
  • Vaibhava Goel
  • Yong Qin
Abstract

Many of today’s speech recognition applications can benefit from long-term speaker adaptation using speaker logs, and discriminative methods are a promising approach given their previous successes. This paper carries out large-vocabulary speech recognition experiments to investigate the performance of feature-space and model-space discriminative adaptation methods for long-term speaker adaptation. The experimental results suggest that although discriminative adaptation does not, on average, obtain a large gain over maximum-likelihood (ML) adaptation, a number of test speakers still show significant improvements. Motivated by this observation, we further propose an efficient method to automatically select the speakers who can obtain large improvements from discriminative adaptation. When 35% to 65% of the whole test population is selected for discriminative adaptation, the relative word error rate (WER) reduction over ML adaptation reaches 4% to 5%, measured on the selected speakers only.
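The headline figure in the abstract is a relative WER reduction measured only on the selected speakers. The sketch below shows how such a figure could be computed from per-speaker error counts. All counts, the `selected` indices, and the function names are hypothetical and chosen for illustration; they are not taken from the paper, and the paper's actual speaker-selection method is not reproduced here.

```python
from typing import Sequence


def pooled_wer(errors: Sequence[int], words: Sequence[int]) -> float:
    """WER pooled over speakers: total word errors / total reference words."""
    total_words = sum(words)
    if total_words <= 0:
        raise ValueError("total reference word count must be positive")
    return sum(errors) / total_words


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Relative reduction of adapted_wer over baseline_wer, as a fraction."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical per-speaker counts (NOT from the paper).
ref_words = [1000, 1200, 800, 1500]    # reference words per speaker
ml_errors = [200, 180, 160, 300]       # errors after ML adaptation
disc_errors = [190, 181, 154, 301]     # errors after discriminative adaptation

# Hypothetical selection: indices of speakers chosen for discriminative adaptation.
selected = [0, 2]


def pick(values: Sequence[int], idx: Sequence[int]) -> list:
    """Keep only the entries of `values` at positions `idx`."""
    return [values[i] for i in idx]


# Relative reduction over the whole test population.
rel_all = relative_wer_reduction(
    pooled_wer(ml_errors, ref_words),
    pooled_wer(disc_errors, ref_words),
)

# Relative reduction when only the selected speakers are inspected.
rel_selected = relative_wer_reduction(
    pooled_wer(pick(ml_errors, selected), pick(ref_words, selected)),
    pooled_wer(pick(disc_errors, selected), pick(ref_words, selected)),
)

print(f"all speakers:      {rel_all:.2%}")
print(f"selected speakers: {rel_selected:.2%}")
```

With these made-up counts, the gain over the full population is small while the gain on the selected subset is several times larger, which is the kind of contrast the abstract describes. Pooling errors before dividing, instead of averaging per-speaker WERs, is one common convention and is assumed here.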

Similar resources

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

Wide Residual BLSTM Network with Discriminative Speaker Adaptation for Robust Speech Recognition

We present a system for the 4th CHiME challenge which significantly increases the performance for all three tracks with respect to the provided baseline system. The front-end uses a bidirectional Long Short-Term Memory (BLSTM)-based neural network to estimate signal statistics. These then steer a Generalized Eigenvalue beamformer. The back-end consists of a 22 layer deep Wide Residual Network a...

Augmenting short-term cepstral features with long-term discriminative features for speaker verification of telephone data

Short-term cepstral features have long been chosen as standard features for speaker recognition thanks to their relevance and effectiveness. In contrast, discriminative features, calculated by a multi-layer perceptron (MLP) from much longer stretches of time, have been gradually adopted in automatic speech recognition (ASR). It has been shown that augmenting short-term cepstral features with lo...

Comparison of discriminative training methods for speaker verification

The maximum likelihood estimation (MLE) and Bayesian maximum a-posteriori (MAP) adaptation methods for Gaussian mixture models (GMM) have proven to be effective and efficient for speaker verification, even though each speaker model is trained using only his own training utterances. Discriminative criteria aim at increasing discriminability by using out-of-class data. In this paper, we consider ...

Publication date: 2012